HYDRAstor: A Scalable Secondary Storage

Authors

Cezary Dubnicki

Leszek Gryz

Lukasz Heldt

Michal Kaczmarczyk

Wojciech Kilian

Przemyslaw Strzelczak

Jerzy Szczepkowski

Cristian Ungureanu

Michal Welnicki

Abstract

HYDRAstor is a scalable secondary storage solution aimed at the enterprise market. The system consists of a back-end architected as a grid of storage nodes built around a distributed hash table, and a front-end consisting of a layer of access nodes that implement a traditional file system interface and can be scaled in number for increased performance. This paper concentrates on the back-end, which is, to our knowledge, the first commercial implementation of a scalable, high-performance content-addressable secondary storage delivering global duplicate elimination, per-block user-selectable failure resiliency, and self-maintenance, including automatic recovery from failures with data and network overlay rebuilding. The back-end programming model is based on an abstraction of a sea of variable-sized, content-addressed, immutable, highly resilient data blocks organized in a DAG (directed acyclic graph). This model is exported with a low-level API allowing clients to implement new access protocols and to add them to the system on-line. The API has been validated with an implementation of the file system interface. The critical factor for meeting the design targets has been the selection of a proper data organization based on redundant chains of data containers. We present this organization in detail and describe how it is used to deliver the required data services. Surprisingly, the most complex service to deliver turned out to be on-demand data deletion, followed (not surprisingly) by the management of data consistency and integrity.
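The programming model in the abstract (immutable, content-addressed blocks linked into a DAG, with global duplicate elimination) can be illustrated with a small in-memory sketch. This is not HYDRAstor's API, which the abstract does not show: the names `Block`, `BlockStore`, `write`, and `read`, and the choice of SHA-256 as the address function, are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Immutable block: a payload plus the addresses of the blocks it points to."""
    data: bytes
    pointers: tuple = ()


class BlockStore:
    """Hypothetical in-memory store; a sketch of the model, not HYDRAstor's API."""

    def __init__(self):
        self._blocks = {}

    def write(self, data, pointers=()):
        pointers = tuple(pointers)
        # A block may only point at blocks that already exist.
        for p in pointers:
            if p not in self._blocks:
                raise KeyError(f"unknown block address: {p}")
        # The address is derived from the content (payload and pointers).
        h = hashlib.sha256()
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
        for p in pointers:
            h.update(bytes.fromhex(p))
        address = h.hexdigest()
        # Duplicate elimination: identical content is stored only once.
        self._blocks.setdefault(address, Block(data, pointers))
        return address

    def read(self, address):
        return self._blocks[address]

    def __len__(self):
        return len(self._blocks)


store = BlockStore()
leaf_a = store.write(b"chunk of a backup stream")
leaf_b = store.write(b"chunk of a backup stream")  # duplicate write
root = store.write(b"directory", [leaf_a])
```

In this sketch the two leaf writes return the same address and occupy one slot. Because an address depends on the block's content and a block can only reference addresses that already exist, the resulting graph stays acyclic in practice.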

Similar resources

HydraFS: A High-Throughput File System for the HYDRAstor Content-Addressable Storage System

A content-addressable storage (CAS) system is a valuable tool for building storage solutions, providing efficiency by automatically detecting and eliminating duplicate blocks; it can also be capable of high throughput, at least for streaming access. However, the absence of a standardized API is a barrier to the use of CAS for existing applications. Additionally, applications would have to deal ...

Impact of Data Organization on Distributed Storage System

With the explosive growth of data stored in digital format, there is a need for a new approach to data storage. The large amount of stored data requires modern storage systems to be scalable and easily extendable on-line. Moreover, the data must be resilient and highly available, which in turn requires failure-tolerant and highly available storage. To address these needs a new storage segment calle...

Resource Allocation in Selfish and Cooperative Distributed Systems

In this dissertation we take an algorithmic view on resource allocation problems in distributed systems. We present a comprehensive perspective by studying a variety of distributed systems—from abstract models of generic distributed systems, through more specific and detailed models, to real distributed computer systems. These systems differ with respect to the nature of the resource allocation...

An Efficient Secret Sharing-based Storage System for Cloud-based Internet of Things

The Internet of Things (IoT) is a new information architecture based on the internet that develops interactions between objects and services in a secure and reliable environment. As the availability of smart devices rises, secure and scalable mass storage systems for aggregate data are required in IoT applications. In this paper, we propose a new method for storing aggregate data in Io...

Scalable Services for Video-on-demand

Video-on-demand (VOD) refers to video services in which users can request any video program from a server at any time. VOD has important applications in entertainment, education, information, and advertising, such as movie-on-demand, distance learning, home shopping, interactive news, etc. In order to provide VOD services accommodating a large number of video titles and concurrent users, a VOD...

Publication date: 2009
